Doctoral Dissertation: Barge-in Robust Spoken Dialogue Interface Using Multichannel Sound Field Control and Array Signal Processing

Author

  • Shigeki Miyabe
Abstract

A spoken dialogue system is in demand as a user-friendly human-machine interface that requires no special skill to operate. Speech has two advantageous features: it is hands-free and eyes-free, i.e., one can speak while doing other tasks. To exploit these features, the system should remain usable even when the user stands away from the microphone or barges in, that is, speaks while the system's output sound (the response sound) is still playing. Satisfying these demands is difficult because automatic speech recognition (ASR) degrades when the response sound feeds back into the microphone and when interfering noise from sources other than the user's speech is observed. Since current ASR systems are sensitive to noise, a noise reduction method is indispensable. An acoustic echo canceller (AEC) is generally used to eliminate the response sound, and an adaptive beamformer (ABF) to eliminate the interfering noise. In each method, a filter is adapted to eliminate its target noise under the minimum-mean-squared-error criterion. Consequently, when a filter is trained on signals that contain sources other than its target noise, its performance degrades severely. To prevent such degradation, the system should detect the periods in which the observed signals contain sounds other than the target noise, a task known as double-talk detection (DTD). However, accurate DTD is difficult, particularly when both response sound and interfering ...

∗ Doctoral Dissertation, Department of Information Processing, Graduate School of Information Science, Nara Institute of Science and Technology, NAIST-IS-DD0561031, September 30, 2007.
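The abstract's point about adapting an echo-cancelling filter during double-talk can be illustrated with a small simulation. The sketch below is not the dissertation's method: it uses a plain normalized LMS (NLMS) filter as a stand-in for an MMSE-style adaptive echo canceller, a made-up four-tap echo path, white-noise signals, and an oracle double-talk detector that simply knows when the user starts speaking. All names (`nlms_echo_canceller`, `echo_path`, `oracle_dtd`, `misalignment`) are hypothetical and chosen for illustration only.

```python
import random


def nlms_echo_canceller(reference, mic, taps=4, mu=0.5, eps=1e-8, adapt=None):
    """Cancel the echo of `reference` from `mic` with an NLMS filter.

    `adapt` is an optional list of booleans; the filter is updated at
    sample n only when adapt[n] is true (all samples if adapt is None).
    Returns (residual signal, final filter weights).
    """
    weights = [0.0] * taps
    buf = [0.0] * taps  # most recent reference samples, newest first
    residual = []
    for n, (x, d) in enumerate(zip(reference, mic)):
        buf = [x] + buf[:-1]
        echo_estimate = sum(w * b for w, b in zip(weights, buf))
        e = d - echo_estimate
        residual.append(e)
        if adapt is None or adapt[n]:
            norm = sum(b * b for b in buf) + eps
            weights = [w + mu * e * b / norm for w, b in zip(weights, buf)]
    return residual, weights


def misalignment(weights, true_path):
    """Squared distance between the estimated and the true echo path."""
    return sum((w - h) ** 2 for w, h in zip(weights, true_path))


rng = random.Random(0)
num_samples = 2000
half = num_samples // 2

# Assumed (hypothetical) loudspeaker-to-microphone impulse response.
echo_path = [0.5, -0.3, 0.1, 0.0]

# Response sound played by the system, and its echo at the microphone.
response = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
echo = [
    sum(echo_path[k] * response[n - k] for k in range(len(echo_path)) if n - k >= 0)
    for n in range(num_samples)
]

# The user barges in during the second half (double-talk).
user = [0.0 if n < half else rng.gauss(0.0, 1.0) for n in range(num_samples)]
mic = [a + b for a, b in zip(echo, user)]

# Oracle DTD (an assumption): adaptation is allowed only before the barge-in.
oracle_dtd = [n < half for n in range(num_samples)]

residual_always, w_always = nlms_echo_canceller(response, mic)
residual_frozen, w_frozen = nlms_echo_canceller(response, mic, adapt=oracle_dtd)

print("misalignment, always adapting:", misalignment(w_always, echo_path))
print("misalignment, frozen by DTD: ", misalignment(w_frozen, echo_path))
```

When adaptation continues during the barge-in, the user's speech acts as noise in the update and pulls the weights away from the true echo path; freezing the update keeps the estimate learned before the interruption. A real detector has no oracle, which is the difficulty the abstract refers to.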

Similar resources

Interface for Barge-in Free Spoken Dialogue System Combining Adaptive Sound Field Control and Microphone Array

This paper describes a new interface for a barge-in free spoken dialogue system combining an adaptive sound field control and a microphone array. In order to actualize robustness against the change of transfer functions due to the various interferences, the barge-in free spoken dialogue system which uses sound field control and a microphone array has been proposed by one of the authors. However...

BARGE-IN FREE SPOKEN DIALOGUE INTERFACE USING NULLSPACE-BASED SOUND FIELD CONTROL AND BEAMFORMING (ThuAmPO4)

This paper describes a new small-scale interface for a barge-in free spoken dialogue system combining a multichannel sound field control and a microphone array, in which the response sound from the system can be canceled out at the microphone points. The conventional method inhibits the user from moving because the system forces the user to stay in the fixed position where the response sound is...

Interface for barge-in free spoken dialogue system using adaptive sound field control

This paper describes a new interface for a barge-in free spoken dialogue system combining an adaptive sound field control and a microphone array. In order to actualize robustness against the change of transfer functions due to the various interferences, the barge-in free spoken dialogue system which uses sound field control and a microphone array has been proposed by one of the authors. However...

Interface for Barge-in Free Spoken Dialogue System Based on Sound Field Reproduction and Microphone Array

A barge-in free spoken dialogue interface using sound field control and microphone array is proposed. In the conventional spoken dialogue system using an acoustic echo canceller, it is indispensable to estimate a room transfer function, especially when the transfer function is changed by various interferences. However, the estimation is difficult when the user and the system speak simultaneousl...

Continuously Predicting and Processing Barge-in During a Live Spoken Dialogue Task

Barge-in enables the user to provide input during system speech, facilitating a more natural and efficient interaction. Standard methods generally focus on single-stage barge-in detection, applying the dialogue policy irrespective of the barge-in context. Unfortunately, this approach performs poorly when used in challenging environments. We propose and evaluate a barge-in processing method that ...

Publication date: 2007